Policy Invariance under Reward Transformations for General-Sum Stochastic Games

Authors

Xiaosong Lu

Howard M. Schwartz

Sidney Nascimento Givigi

Abstract

We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remain unchanged after potential-based shaping is applied to the environment. This policy invariance property provides a possible way of speeding up convergence when learning to play a stochastic game.
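The invariance claim can be checked numerically on a toy example. The sketch below is not the authors' code: it assumes the standard potential-based shaping form, in which player i's reward for a transition from s to s' is augmented by gamma * Phi_i(s') - Phi_i(s), and it uses a hypothetical two-state, two-player general-sum game whose transitions, rewards, potentials, and fixed joint policy are all made up for illustration. Under those assumptions, each player's action values under the shaped rewards should equal the original values minus Phi_i(s).

```python
import itertools

GAMMA = 0.9
STATES = (0, 1)
ACTIONS = (0, 1)                 # both players share this action set (toy assumption)
PHI = ((0.0, 5.0), (-3.0, 2.0))  # PHI[i][s]: arbitrary potential of player i (hypothetical)
PI = ((0.3, 0.8), (0.6, 0.1))    # PI[i][s]: probability that player i plays action 1

def next_state_probs(s, a1, a2):
    """Hypothetical transition model: distribution over next states (0, 1)."""
    p1 = 0.2 + 0.3 * a1 + 0.3 * a2 - 0.1 * s
    return (1.0 - p1, p1)

def reward(i, s, a1, a2, s2):
    """Hypothetical general-sum rewards: the two players' payoffs are unrelated."""
    if i == 0:
        return s2 + a1 - 0.5 * a2
    return a1 * a2 - s

def shaped_reward(i, s, a1, a2, s2):
    """Original reward plus the potential-based term gamma*Phi(s') - Phi(s)."""
    return reward(i, s, a1, a2, s2) + GAMMA * PHI[i][s2] - PHI[i][s]

def action_prob(i, s, a):
    return PI[i][s] if a == 1 else 1.0 - PI[i][s]

def q_values(i, reward_fn, sweeps=2000):
    """Iterative policy evaluation of player i's Q under the fixed joint policy."""
    v = {s: 0.0 for s in STATES}
    q = {}
    for _ in range(sweeps):
        for s, a1, a2 in itertools.product(STATES, ACTIONS, ACTIONS):
            q[s, a1, a2] = sum(
                p * (reward_fn(i, s, a1, a2, s2) + GAMMA * v[s2])
                for s2, p in enumerate(next_state_probs(s, a1, a2))
            )
        v = {
            s: sum(
                action_prob(0, s, a1) * action_prob(1, s, a2) * q[s, a1, a2]
                for a1, a2 in itertools.product(ACTIONS, ACTIONS)
            )
            for s in STATES
        }
    return q

def max_gap():
    """Largest deviation from Q_shaped(s, a) = Q(s, a) - Phi_i(s) over both players."""
    gap = 0.0
    for i in (0, 1):
        q = q_values(i, reward)
        q_shaped = q_values(i, shaped_reward)
        for key in q:
            gap = max(gap, abs(q_shaped[key] - (q[key] - PHI[i][key[0]])))
    return gap

if __name__ == "__main__":
    print("max deviation:", max_gap())
```

Because the shift Phi_i(s) does not depend on the joint action, each player's ranking of its own actions against the other player's policy is the same before and after shaping, which is the intuition behind the equilibria being preserved. This check covers one fixed policy in one toy game and is not a substitute for the proof.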

Similar resources

Cooperative Benefit and Cost Games under Fairness Concerns

Solution concepts in cooperative games are based on either cost games or benefit games. Although cost games and benefit games are strategically equivalent, that is not the case in general for solution concepts. Motivated by this important observation, a new property called invariance property with respect to benefit/cost allocation is introduced in this paper. Since such a property can be regar...

R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning

R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complete, but possibly inaccurate model of its environment and acts based on the optimal policy derived from this model. The model is initialized in an optimistic fashion: all actions in all states return the maximal possible...

Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games

In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermediate reward to be the difference between our score and our opponent’s score; the “true” reward of a win, loss, or tie is determined at the end of a game by applying a threshold function to the cumulative intermediate re...

Monetary and Fiscal Policy Interaction in Iran: A Dynamic Stochastic General Equilibrium Approach

Achieving the goals of price stability, sustainable economic growth, and the improvement of many economic variables requires coordination between the monetary and financial authorities. In this study, a new modified Keynesian dynamic stochastic general equilibrium model is introduced for Iran and in the framework of game theory, optimal policy of fiscal and monetary authorities are d...

Definable Zero-Sum Stochastic Games

Definable zero-sum stochastic games involve a finite number of states and action sets, reward and transition functions that are definable in an o-minimal structure. Prominent examples of such games are finite, semi-algebraic or globally subanalytic stochastic games. We prove that the Shapley operator of any definable stochastic game with separable transition and reward functions is definable in...

Journal:

J. Artif. Intell. Res.

Volume 41

Publication date: 2011
